A PARSER FOR THE ISO 8211 DATA FORMAT


The ISO 8211 File Format

The International Organization for Standardization (ISO) has defined standard 8211 as a "data
descriptive file for information interchange" [ISO85]. The ISO 8211 format is hierarchical.
Terms used in the standard document ISO 8211-1985(E) will be used to refer to the sections
of this hierarchy. Each file starts with a single Data Descriptive Record (DDR), which describes
the formats of the Data Records (DR) that follow. There may be many DR sections. All user
data is contained in the DR sections.

Both the DDR and each of the DRs have a similar internal structure. Each is divided into
three sections. Each begins with a 24-byte leader that gives the sizes of the sections that follow.
The leader is followed by a directory that gives the lengths and positions (offsets) of each of
the data fields to be found in the final section. The final section is divided into fields, and each
field is given a mnemonic alphanumeric tag for identification. These tags are defined in the
directory section. The DDR and each DR have a similar directory section, but the final section of
each of these differs. The final section of the DDR is called the Data Descriptive Area (DDA), and
the final section of each DR is called the User Data Area (UDA). As its name implies, it is in the
UDA that the actual data being transmitted by the ISO 8211 file is stored.
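
The numeric fields of a leader are stored as fixed-width ASCII digits without a terminating
NUL, so they have to be converted before the directory can be read. The following is only a
minimal sketch of such a conversion and is not taken from the library; the helper name
ascii_field_to_int is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not part of the library): convert a fixed-width,
 * non-terminated ASCII numeric field, such as those found in a 24-byte
 * leader, to an int.  Leading spaces are skipped.  Returns -1 if a
 * non-digit is found or if the field holds no digits at all. */
int ascii_field_to_int(const char *field, size_t width)
{
    size_t i = 0;
    int value = 0;
    int digits = 0;

    assert(field != NULL);
    while (i < width && field[i] == ' ')
        ++i;
    for (; i < width; ++i) {
        if (!isdigit((unsigned char)field[i]))
            return -1;
        value = value * 10 + (field[i] - '0');
        ++digits;
    }
    return digits > 0 ? value : -1;
}
```

Applied to the first five bytes of a leader that begins with "00123", this helper returns 123.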

The appendix to this report contains a listing of the C programming language "include" file
iso8211.h, which defines data structures used in the parser. Refer to this listing for definitions of
structures and constants mentioned in the following discussion.

The DDA

The DDA contains a succession of fields, which in turn have subfields. Fields are separated
by a field terminator, which is the ASCII character with hexadecimal value 1e. Subfields are
separated by a unit terminator, which is the ASCII character with hexadecimal value 1f. The
subfields are, in order, field controls, data field name, label, and format controls. Not all of
these subfields need exist. Missing subfields are indicated by a pair of consecutive field
terminators. The mnemonic tags from the DDR directory are assigned to each field in turn, so the
number of tags in the directory must be the same as the number of fields in the DDA. The tags
are then referred to in each DR to connect the data in the DR with the data in the DDR.
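
To illustrate how the two terminators delimit subfields, here is a small sketch that measures
the subfield starting at a given position in a buffer. It is an illustration only, not the
library's code: the helper name subfield_length is hypothetical, and the terminator macros
are repeated from iso8211.h so that the fragment compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

#define FIELD_TERM '\036'   /* hexadecimal 1e */
#define UNIT_TERM  '\037'   /* hexadecimal 1f */

/* Hypothetical helper (not part of the library): return the number of
 * bytes before the next unit terminator, field terminator, or the end
 * of the available buffer, whichever comes first. */
size_t subfield_length(const char *s, size_t avail)
{
    size_t n = 0;

    assert(s != NULL);
    while (n < avail && s[n] != UNIT_TERM && s[n] != FIELD_TERM)
        ++n;
    return n;
}
```

A result of zero at a position where a subfield is expected means that the subfield is empty.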

Here is an example of a DDA field, which has been formatted for readability by breaking it
into separate lines:

1600;&
TEST_PATCH_IDENTIFIER_FIELD&
PNM!DWV!REF!PUR!PIR!PIG!PIB&
(A(7),I(6),R(5),R(5),I(3),I(3),I(3));

The first line, "1600;&", is the field controls. The number tells what type of field this refers to
(here a mixed vector field) and the characters ";" and "&" tell us that these characters may be used
as printed representations of the field terminator and the unit terminator respectively, as is done in
the listing above. The second line is an identifying name terminated by a unit terminator. The
third line is a label, which is in this case a vector label consisting of a series of subfield labels for
the DR. Each subfield label is separated from its neighbor by a "!". The fourth and final line has
the format controls subfield, which specifies the format of data in the UDA by using a
FORTRAN-like syntax. The format controls subfield is delimited by a field terminator. The DDA
field as a whole is also delimited by a field terminator.

For a vector label such as this example, each vector subfield tag is associated with one of
the data formats in the format control. Therefore, subfield tag "PNM" refers to an ASCII string
that is seven characters (bytes) long; tag "DWV" refers to an integer that is encoded as a string of
six ASCII numeric characters; and tag "REF" refers to a floating point number encoded as a string
of five ASCII numeric characters with floating point characters such as "." also allowed.
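
Because subfield labels are separated by "!", a vector label can be broken into its tags with a
few lines of C. The sketch below shows one possible way to do this; it is not the library's
implementation, and the helper name split_vector_label is hypothetical. The tags it returns
could then be paired, in order, with the items of the format controls.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of the library): split a vector label
 * such as "PNM!DWV!REF" in place at each '!' and store pointers to the
 * subfield tags in tags[].  Returns the number of tags stored, which is
 * at most max_tags. */
int split_vector_label(char *label, char *tags[], int max_tags)
{
    int n = 0;
    char *p = label;

    assert(label != NULL && tags != NULL);
    while (n < max_tags) {
        tags[n++] = p;
        p = strchr(p, '!');
        if (p == NULL)
            break;
        *p++ = '\0';        /* terminate this tag, step to the next */
    }
    return n;
}
```

The label is modified in place, so the caller should pass a writable copy rather than a string
literal.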

The UDA

The final section of each DR is called the User Data Area (UDA). It is here that the useful
data being transmitted by the ISO 8211 file is stored. Following the leader, the directory section
of each DR contains tags that must also appear in the DDR directory. These tags then index the
corresponding entry in the DDA, which tells how to read the UDA by supplying formats. A
given tag in the DDR may be referenced any number of times in a DR.

For all the details of ISO 8211, refer to the standards document. Enough of ISO 8211 has
now been presented so that we can understand the important data structures used in this parser.

The Parser

This parser runs under the UNIX operating system, although it has been written to be
portable to different environments. It works by building lists of structures that can then be
interrogated by programs that need the data to read ISO 8211 files. The parser moves forward
through the file being parsed. For example, when the parse of a DR directory is complete the file
pointer is at the beginning of the associated UDA. Every parsing program takes a stream pointer
to the file being read. In the C programming language this is declared as a (FILE *) type, such as

FILE *fp;


The parser turns input data into a set of lists. Lists are formed from C data structures linked
together using a pointer that is part of the structure. This pointer is always named next. Lists are
always terminated by a NULL pointer. Any type of list may then be traversed using C code such
as the following, which traverses list "foo" by using a user-defined pointer named "foop":

for (foop = foo; foop != NULL; foop = foop->next)
{
    /* do something with elements of list foo */
}
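
As a concrete instance of this traversal, the sketch below searches a list for an element with
a given tag. It is an illustration under stated assumptions rather than library code: the vector
structure is repeated from iso8211.h so that the fragment compiles on its own, and the helper
name find_vector_tag is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same layout as the vector structure in iso8211.h, repeated here so
 * that the sketch is self-contained. */
typedef struct vector {
    char *tag;
    struct vector *next;
} vector;

/* Hypothetical helper (not part of the library): return the first list
 * element whose tag equals `tag`, or NULL if there is none. */
vector *find_vector_tag(vector *list, const char *tag)
{
    vector *vp;

    assert(tag != NULL);
    for (vp = list; vp != NULL; vp = vp->next)
        if (strcmp(vp->tag, tag) == 0)
            return vp;
    return NULL;
}
```

The same loop shape works for every list type in the parser, because each structure carries
its own next pointer.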

All data structures used by the parser are defined in file iso8211.h, which is listed in the
appendix. Data from the DDR is parsed into a list based on a C structure called dda_entry and
data from each DR is parsed into a list based on a C structure called uda_entry. These two lists
are the things that a user of the library will be concerned with.

To make the dda_entry list, call the function parse_ddr(), which returns a pointer to the
head of the dda_entry list. Similarly, to parse the next DR, call function parse_dr(), which
returns a pointer to the head of a uda_entry list. Each of these lists will now be discussed in turn.

The Parser: DDR Section

The dda_entry list is only parsed once since there is only one DDR in an ISO 8211 file.
Function parse_ddr() takes a single argument, a pointer to the open ISO 8211 file being parsed,
and returns a list of dda_entry structures. This list is then searched for matching tags when
parsing each DR. The dda_entry structure is:

typedef struct dda_entry
{
    int structure_type;         /* ELEMENTARY, VECTOR, ARRAY */
    int data_type;              /* INT, FLOAT, EXP_FLOAT, ... */
    char *name;                 /* long descriptive name */
    char *tag;                  /* same as in corresponding ddr_entry */
    int label_type;             /* VECT, CARTESIAN, ARRAY_DESC */
    union label *label;
    struct format *format;
    struct format *repeat;      /* indicate repeating part of format list */
    struct dda_entry *next;
} dda_entry;

The first two members of this structure, structure_type and data_type, hold enumerated types,
which can be found in the iso8211.h file. The third and fourth members, name and tag, refer to
the long name for the entry and the short tag by which it will be referenced. The fifth member,
label_type, is an enumerated type code for the type of label found in the next member, label. The
label member is a pointer to a union, which stores a type of data indexed by the label_type
member:

typedef union label
{                               /* a label will be one of three types */
    struct vector *vector;
    struct cartesian *cartesian;
    struct array_desc *desc;
} label;


The label union can contain pointers to structures representing the three types of label supported
under ISO 8211:

typedef struct vector
{
    char *tag;
    struct vector *next;
} vector;

typedef struct cartesian
{
    struct vector *rows;
    struct vector *cols;
    struct vectors *vecs;       /* higher dimensions if needed */
} cartesian;

typedef struct array_desc
{
    int length;                 /* length of a dimension */
    struct array_desc *next;
} array_desc;

The cartesian structure refers to a list of these structures:

typedef struct vectors
{                               /* needed for cartesian labels more than
                                 * 2D */
    vector *vec;
    struct vectors *next;
} vectors;

The vectors structure allows multidimensional arrays to be stored as lists of vector structures.
Structure array_desc stores an array descriptor, a rather strange label that indicates the
dimensions of an array, which will follow in the UDA. See the standard for an explanation of array
descriptors.

The final two members of a dda_entry structure are pointers to a list of format structures:

typedef struct format
{
    int type;                   /* INT, FLOAT, EXP_FLOAT, ... */
    int length;                 /* either this or delimiter must be '\000' */
    char delimiter;
    struct format *next;
} format;

The UDA data may be delimited by either specifying its length or by specifying a delimiter
character, which may not appear in the data itself. The format structure allows for each of these
delimiting techniques, although at least one of the members length or delimiter must be zero (for
this parser, binary zero is therefore not allowed as a delimiter). The 8211 standard says that if
both length and delimiter are zero, the data elements are separated by unit terminators.

There are two format pointers in the dda_entry structure because the format is defined to
implicitly repeat the last parenthesized expression at its right end. A repeat pointer is needed to
allow data to be read using this implicit repeating format.
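
One way to use the repeat pointer when stepping through a format list is sketched below. This
is an assumption about how a caller might do it, not the library's own code: the format
structure is repeated from iso8211.h so that the fragment compiles on its own, and the helper
name next_format is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the format structure in iso8211.h, repeated here so
 * that the sketch is self-contained. */
typedef struct format {
    int type;
    int length;
    char delimiter;
    struct format *next;
} format;

/* Hypothetical helper (not part of the library): return the format
 * that follows `current`.  When the end of the list is reached,
 * continue from `repeat`, the start of the implicitly repeating part
 * of the list. */
format *next_format(format *current, format *repeat)
{
    assert(current != NULL);
    return current->next != NULL ? current->next : repeat;
}
```

With this approach the list itself stays NULL-terminated, so ordinary traversals still end, while
a reader of repeating data can keep asking for the next format as long as data remains.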


The Parser: DR Sections

As mentioned, there is only one DDR in an ISO 8211 file, so the DDR section only needs to
be parsed once. There may be many DR sections, however, so parsing of the DR is done by a
separate program that is called as many times as needed. Program parse_dr() has one argument: a
pointer to the file being parsed. Parse_dr() returns a list of uda_entry structures:

typedef struct uda_entry
{
    char *field_tag;            /* length is up to field terminator */
    char *vec_tag;              /* length is up to next vector item */
    char type;                  /* A, I, R, S, C, B, or X */
    union {
        char *cp;               /* CHAR (actually a string) */
        int i;                  /* INT */
        double d;               /* FLOAT, EXP_FLOAT */
        int *bf;                /* BITFIELD, CHAR_BIT_STRING */
        void *ignore;           /* IGNORE */
    } data;                     /* user data */
    struct uda_entry *next;
} uda_entry;

The field_tag member corresponds to field tags in the dda_entry structure and is used to find a
corresponding entry in the dda_entry list. Member vec_tag is one of the vector subfield tags
mentioned above in the discussion of the DDA and is used to find the exact match for a format from
the format list associated with each item in the dda_entry list. The type member is a character
indicating the data type that will be stored in this instance of the uda_entry structure. The data
member is a union whose type is indexed by the type member.
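
Because the data union is only meaningful together with the type member, code that reads an
entry must switch on the type first. The following sketch shows the idea for three of the type
codes. It is not part of the library: the structure is a reduced copy of uda_entry, and the helper
name format_uda_entry and its "TAG=value" output are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Reduced copy of uda_entry from iso8211.h (the bf and ignore members
 * of the union are left out), repeated so the sketch is self-contained. */
typedef struct uda_entry {
    char *field_tag;
    char *vec_tag;
    char type;                  /* 'A', 'I', 'R' or 'S' in this sketch */
    union {
        char *cp;
        int i;
        double d;
    } data;
    struct uda_entry *next;
} uda_entry;

/* Hypothetical helper (not part of the library): write one entry as
 * "TAG=value" into buf.  Returns the value returned by snprintf, or -1
 * for a type code that this sketch does not handle. */
int format_uda_entry(const uda_entry *e, char *buf, size_t size)
{
    assert(e != NULL && buf != NULL);
    switch (e->type) {
    case 'A':
        return snprintf(buf, size, "%s=%s", e->vec_tag, e->data.cp);
    case 'I':
        return snprintf(buf, size, "%s=%d", e->vec_tag, e->data.i);
    case 'R':
    case 'S':
        return snprintf(buf, size, "%s=%g", e->vec_tag, e->data.d);
    default:
        return -1;
    }
}
```

Reading a union member other than the one selected by type gives meaningless results, which
is why the switch comes before any access to data.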

Besides the high-level functions parse_ddr() and parse_dr(), there are lower-level parsers
available for those cases when more control is needed. These are

extern struct ddr_leader *parse_ddr_leader();
extern struct ddr_entry *parse_ddr_directory();
extern struct dda_entry *parse_dda();

which separately parse the three main sections of the DDR, and

extern struct dr_leader *parse_dr_leader();
extern struct dr_entry *parse_dr_directory();

which parse the first two sections of a DR. See file iso8211.h for definitions of the structures
referred to in these function declarations. The UDA is too variable to support a parser in this
library; the user of the library must define one. The code that follows gives an example of this.

An Example

Here is an example of C code that uses the programs parse_ddr() and parse_dr():

#include <stdio.h>
#include <iso8211.h>

main(argc, argv)
int argc;
char **argv;
{
    struct dda_entry *dda = NULL;
    struct dr_entry *dr = NULL, *drp;
    FILE *fp;

    /* open the file to parse; taking its name from argv[1] is an assumption */
    if (argc < 2 || (fp = fopen(argv[1], "r")) == NULL)
        exit(1);

    dda = parse_ddr(fp);

    while (1)                   /* do until EOF */
    {
        dr = parse_dr(fp);

        /* Do something in here with data from DR, if desired. */

        /* last dr element has the seek information we need to go past uda */
        for (drp = dr; drp->next != NULL; drp = drp->next)
            ;

        /* for now, seek past the uda (1 = relative to current position) */
        if (fseek(fp, (long) (drp->position + drp->length), 1) == -1)
            exit(0);

        /*
         * No parse_uda() function is defined here because the
         * user data area (uda) may contain many types of structures and the
         * parse is therefore data-dependent.
         */
    }
}

A Longer Example

As the last comment in the code above shows, there is no parse_uda() function defined in
the library. The library takes care of those parts of ISO 8211 that are not data-dependent. The
user of this library should write UDA parsers, as needed, based upon the information retrieved by
the parsers described here. A final, rather long, example showing such usage is this section of
code from a parser of ARC Digitized Raster Graphics (ADRG), a product of the U.S. Defense
Mapping Agency:


/************ TRANSMITTAL HEADER FILE ************/
/*
 * Transmittal Header File always has the same name, so just open
 * the one in the current directory. From the THF we want filenames
 * and the corners of the Distribution Rectangle in lat/lon.
 */
if ((fp = fopen("TRANSH01.THF", "r")) == NULL)
{
    fprintf(stderr, "File open error: %s\n", argv[1]);
    exit(1);
}

dda = parse_ddr(fp);            /* parse the file directory */

/* THF has 4 records. First record contains corner coords of image. */
uda = parse_next_dr(dda, fp);
get_corners(&nw, &se, uda);

/*
 * Next two dr records contain nothing of interest
 * (security and test patch respectively).
 */
parse_next_dr(dda, fp);
parse_next_dr(dda, fp);

/*
 * Next (and last) dr record contains the filenames. Build and
 * return a directory tree.
 */
uda = parse_next_dr(dda, fp);
root = parse_directory(uda);
fclose(fp);

/* Do similar things with other files */

The library function parse_ddr() is used by this code, but the programmer has encapsulated
the parse_dr() function in a function of his own called parse_next_dr():

/*
 * return a uda list associated with current dr.
 */

#include <stdio.h>
#include <string.h>
#include <iso8211.h>

uda_entry *
parse_next_dr(dda, fp)
dda_entry *dda;
FILE *fp;
{
    dr_entry *dr, *drp;
    dda_entry *dap;
    uda_entry *uda, *temp, *head = NULL;
    int c;

    extern uda_entry *parse_vec();
    extern uda_entry *parse_cart();
    extern uda_entry *parse_desc();

    dr = parse_dr(fp);          /* parse leader and dr directory */

    for (drp = dr; drp != NULL; drp = drp->next)
    {
        for (dap = dda; dap != NULL && strcmp(drp->tag, dap->tag) != 0;
             dap = dap->next)
            ;                   /* find match in dda to get format */

        if (dap == NULL)        /* no match; an error */
        {
            fprintf(stderr, "No match found for tag %s\n", drp->tag);
            break;
        }

        uda = NULL;             /* stays NULL if nothing is parsed below */
        switch (dap->label_type)
        {
        case 0:                 /* no label; there should be a function... */
            break;
        case VECT:
            uda = parse_vec(drp, dap, fp);
            break;
        case CARTESIAN:
            uda = parse_cart(drp, dap, fp);
            break;
        case ARRAY_DESC:
            uda = parse_desc(drp, dap, fp);
            break;
        default:
            fprintf(stderr, "No such label type: %d\n", dap->label_type);
            break;
        }

        /* append the new entries; temp is left pointing at the list tail */
        if (uda != NULL && head == NULL)
        {
            head = uda;
            for (temp = uda; temp->next != NULL; temp = temp->next)
                ;
        }
        else if (uda != NULL)
            for (temp->next = uda; temp->next != NULL; temp = temp->next)
                ;

        while ((c = getc(fp)) == UNIT_TERM || c == FIELD_TERM)
            ;                   /* test next char to see if should skip */
        ungetc(c, fp);
    }

    return head;
}

Function parse_next_dr() in turn calls application-specific UDA parsers that were written to
conform to the ADRG format. Here is one of them:

uda_entry *
parse_vec(dr, dda, fp)
dr_entry *dr;
dda_entry *dda;
FILE *fp;
{
    uda_entry *uda, *temp, *head = NULL;
    vector *vec;
    format *fmt;

    extern void get_data_value();

    for (vec = dda->label->vector, fmt = dda->format;
         fmt != NULL; fmt = fmt->next, vec = vec->next)
    {
        temp = (uda_entry *) malloc(sizeof(uda_entry));
        temp->vec_tag = malloc(strlen(vec->tag) + 1);
        strcpy(temp->vec_tag, vec->tag);
        temp->field_tag = malloc(strlen(dda->tag) + 1);
        strcpy(temp->field_tag, dda->tag);
        temp->next = NULL;

        get_data_value(fmt, temp, fp);  /* temp and fp changed */
        if (head == NULL)
            head = temp;
        else
            uda->next = temp;
        uda = temp;
    }

    return (head);
}

Function parse_vec() builds a list of structures specific to the ADRG format. It uses the lists of
dda_entry and of uda_entry structures from the library functions parse_ddr() and parse_dr() to
help in finding UDA data. Obtaining the UDA data is done by function get_data_value():

void
get_data_value(fmt, uda, fp)
format *fmt;
uda_entry *uda;
FILE *fp;
{
    int c;
    char *buf;

    buf = malloc(fmt->length + 1);

    while ((c = getc(fp)) == UNIT_TERM || c == FIELD_TERM)
        ;                       /* test next char to see if should skip */
    ungetc(c, fp);

    if (fread(buf, 1, fmt->length, fp) != fmt->length)
    {
        fprintf(stderr, "Read error in get_data_value\n");
        free(buf);
        return;
    }

    buf[fmt->length] = '\0';    /* for string operations */
    uda->type = fmt->type;
    switch (fmt->type)
    {
    case 'I':                   /* integer */
        uda->data.i = atoi(buf);
        break;
    case 'A':                   /* (char *) */
        uda->data.cp = malloc(fmt->length + 1);
        strcpy(uda->data.cp, buf);
        break;
    case 'R':                   /* real number */
    case 'S':                   /* exponential real number */
        for (c = 0; buf[c] != '\0'; ++c)
            if (buf[c] == 'D')  /* FORTRAN indicator of exponential */
                buf[c] = 'e';   /* C indicator of exponential */
        uda->data.d = atof(buf);
        break;
    default:                    /* no other data type legal in adrg */
        fprintf(stderr, "Data type %d illegal\n", fmt->type);
        break;
    }
    free(buf);                  /* the value has been copied or converted */
}


This completes the example. Study of this example will show at least one way of using the
basic data structures returned by parse_ddr() and parse_dr() to read the UDA. It is not advisable
to use uda_entry structures for all user data. Large arrays, for example, should be read directly
once other information has been extracted from the ISO 8211 file.

Discussion 

The parser is used by including file iso8211.h in your program and linking the program with
library iso8211. In a C program, this linking is done by a command such as:

cc -o myprog myprog.c -liso8211

The C programs to build the library are available from the author at the internet address of
mike@tec.army.mil, or by anonymous ftp from pooh.tec.army.mil as compressed tar file
pub/iso8211.tar.Z. There is no charge for this code. The programs are all unrestricted and in the
public domain. The code also includes an example parser written using the library. Some of the
programs used in this report are from the example parser.

Appendix


This is a listing of the include file iso8211.h. This file defines constants and data structures
used in the parser and declares the functions available to a user in the iso library. The comments
in the file explain each entry.

#ifndef ISO8211_H
#define ISO8211_H

/*
 * This file and associated programs were written by Mike McDonnell
 * of the U.S. Army Topographic Engineering Center (mike@tec.army.mil).
 * They are in the public domain. Please retain this comment.
 */

/* ddr and dr leaders are of a fixed length; 24 bytes. */
#define LEADER_LENGTH 24

/*
 * Field and unit terminators are used throughout ISO8211 files. The
 * term "unit" means a subfield within a larger field.
 */
#define FIELD_TERM '\036'       /* ctrl-^ */
#define UNIT_TERM  '\037'       /* ctrl-_ */

/*
 * These are mnemonic macros showing what the various dda_entry.controls
 * data types are. Besides these numeric values, the trailing chars "&"
 * and/or ";" indicate that these printable chars may be used as
 * printed representations of UNIT_TERM and FIELD_TERM respectively.
 *
 * The ISO8211 document describes numeric data types as "implicit point"
 * for integers, "explicit point" for floats, and "scaled explicit
 * point" for floats in scientific notation. I have used the more
 * mnemonic names of "INT", "FLOAT", and "EXP_FLOAT" for these numeric
 * types.
 */

/* The first char is the structure type */
enum structure_type
{
    ELEMENTARY,
    VECTOR,
    ARRAY
};

/* The second char is the basic data type */
enum data_type
{
    CHAR,
    INT,
    FLOAT,
    EXP_FLOAT,
    CHAR_BIT_STRING,
    BITFIELD,
    IGNORE
};

/* label types; make numbers big to stay out of way of lex's defaults */
enum label_type
{
    VECT = 3,
    CARTESIAN = 4,
    ARRAY_DESC = 5
};


/*
 * The ISO 8211 file consists of a data descriptive record (ddr)
 * followed by data records (dr). This section describes the structures
 * of the ddr. The ddr in turn describes the structures of the dr.
 */

/*
 * The data definition record (ddr) leader is of fixed format; 24 bytes
 * long. I use a standard trick (for me) of defining an ascii struct to
 * overlay the data in the buffer as read and then define a
 * corresponding struct in which ascii elements are appropriately
 * converted.
 */

typedef struct ascii_ddr_leader
{
    char record_length[5];          /* total length of ddr including
                                     * terminator */
    char interchange_level[1];      /* 3 levels are defined; 1, 2, 3 */
    char leader_id[1];              /* 'L' for ddr leader */
    char extension_flag[1];         /* 'E' for extended char sets; else ' ' */
    char res1[1];                   /* reserved; ' ' for now */
    char application_flag[1];       /* reserved; ' ' for now */
    char field_control_length[2];   /* bytes in ddf for type and
                                     * struct codes; also used in
                                     * df? */
    char dda_base[5];               /* offset of dda in ddr */
    char extended[3];               /* specify extended char sets; else
                                     * ' ' */
    char length_size[1];            /* see below */
    char position_size[1];          /* see below */
    char res2[1];                   /* reserved; '0' for now */
    char tag_size[1];               /* see below */
} ascii_ddr_leader;

typedef struct ddr_leader
{
    int record_length;              /* total length of ddr including
                                     * terminator */
    int interchange_level;          /* 3 levels are defined; 1, 2, 3 */
    char leader_id[2];              /* 'L' for ddr leader */
    char extension_flag[2];         /* 'E' for extended char sets; else ' ' */
    char res1[2];                   /* reserved; ' ' for now */
    char application_flag[2];       /* reserved; ' ' for now */
    int field_control_length;       /* bytes in ddf for type and struct
                                     * codes */
    int dda_base;                   /* offset of dda in ddr */
    char extended[4];               /* specify extended char sets; else ' ' */
    int length_size;                /* see below */
    int position_size;              /* see below */
    int res2;                       /* reserved; '0' for now */
    int tag_size;                   /* see below */
} ddr_leader;

/*
 * Notice that many of these structs have a "next" pointer and so are
 * designed to make lists. As a convention, I do not store the length
 * of these lists. To find length of lists, just traverse them and
 * count the traversals. This is not a very expensive operation and it
 * keeps the data structures simple.
 */

/*
 * A linked list of these structs constitutes the ddr directory. There
 * is a one-to-one correspondence between the ddr_entry structs and the
 * corresponding dda structs as described below. The directory region is
 * terminated with a FIELD_TERM (ctrl-^).
 *
 * Field tags of 0 and 1 are reserved for the filename and the record ID
 * name respectively. 'length' is the total length of the dda field
 * (see below) including terminator characters. 'position' is the offset
 * of the dda field from the start of the dda area.
 */
typedef struct ddr_entry
{
    char *tag;                  /* length gotten from tag_size in leader */
    int length;                 /* ascii length gotten from length_size
                                 * in leader */
    int position;               /* ascii length gotten from
                                 * position_size in leader */
    struct ddr_entry *next;
} ddr_entry;


/*
 * This is the data descriptive area (dda) of the ddr.
 *
 * The length of the dda list will be the same as the length of the
 * ddr_directory list above.
 */

/*
 * Vector label tags are separated from each other by a "!" and formats
 * are in parentheses to be able to build up a tree structure as in
 * LISP. Format specification is as in FORTRAN with repeat specs like
 * 4I(7) to specify four integer fields of 7 ascii numeric characters
 * each. See the standard for the (messy) details of the format spec.
 */
typedef struct vector
{
    char *tag;
    struct vector *next;
} vector;

typedef struct vectors
{                               /* needed for cartesian labels more than
                                 * 2D */
    vector *vec;
    struct vectors *next;
} vectors;

typedef struct cartesian
{
    struct vector *rows;
    struct vector *cols;
    struct vectors *vecs;       /* higher dimensions if needed */
} cartesian;

typedef struct array_desc
{
    int length;                 /* length of a dimension */
    struct array_desc *next;
} array_desc;

typedef union label
{                               /* a label will be one of three types */
    struct vector *vector;
    struct cartesian *cartesian;
    struct array_desc *desc;
} label;


/*
 * The format list will be circular at its end since it must
 * automatically repeat within the last set of parens. Rather than
 * actually make the list circular, I define a pointer to the repeating
 * part of the list, which always repeats to the end.
 *
 * An interesting twist in format is found here in that data may be
 * delimited as well as of a fixed length. Thus A(,) means a string of
 * ASCII characters delimited by a comma. Data may be either delimited
 * or have a fixed length. Therefore at least one of the members
 * "length" or "delimiter" must be zero. They may also both be zero for
 * data delimited by UNIT_TERM.
 */
typedef struct format
{
    int type;                   /* INT, FLOAT, EXP_FLOAT, ... */
    int length;                 /* either this or delimiter must be 0 */
    char delimiter;
    struct format *next;
} format;

typedef struct ascii_dda_entry
{
    char *controls;             /* length is gotten from header
                                 * field_control_length */
    char *name;                 /* length up to terminator */
    char *label;                /* length to terminator */
    char *format;               /* length up to terminator */
    struct ascii_dda_entry *next;
} ascii_dda_entry;

typedef struct dda_entry
{
    int structure_type;         /* ELEMENTARY, VECTOR, ARRAY */
    int data_type;              /* INT, FLOAT, EXP_FLOAT, ... */
    char *name;                 /* long descriptive name */
    char *tag;                  /* same as in corresponding ddr_entry */
    int label_type;             /* VECT, CARTESIAN, ARRAY_DESC */
    union label *label;
    struct format *format;
    struct format *repeat;      /* indicate repeating part of format
                                 * list */
    struct dda_entry *next;
} dda_entry;


/*
 * The ISO 8211 file consists of a data descriptive record (ddr)
 * followed by data records (dr). This section describes the basic
 * structure of all dr. The ddr describes the detailed structures of
 * each dr region. See above for data structures of the ddr.
 *
 * The data record (dr) leader is of fixed format; 24 bytes long.
 *
 * Standard trick here: make an all-ascii struct to overlay on the input
 * buffer and pick up the fields, then have another struct with the same
 * field names which are now integers, etc. as appropriate. Note that
 * even single-character fields are saved as strings so that strncmp()
 * may be used consistently for all comparisons.
 */

typedef struct ascii_dr_leader
{
    char record_length[5];      /* total length of dr */
    char res1[1];               /* reserved; ' ' for now */
    char leader_id[1];          /* 'D' for once; 'R' for repeat */
    char res2[5];               /* reserved; 5 spaces for now */
    char data_base[5];          /* offset of user data area (uda) in dr */
    char res3[3];               /* reserved; 3 spaces for now */
    char length_size[1];        /* see below */
    char position_size[1];      /* see below */
    char res4[1];               /* reserved; '0' for now */
    char tag_size[1];           /* see below */
} ascii_dr_leader;

typedef struct dr_leader
{
    int record_length;          /* total length of dr */
    char res1[2];               /* reserved; ' ' for now */
    char leader_id[2];          /* 'D' for once; 'R' for repeat */
    char res2[6];               /* reserved; 5 spaces for now */
    int data_base;              /* offset of user data area (uda) in dr */
    char res3[4];               /* reserved; 3 spaces for now */
    int length_size;            /* see below */
    int position_size;          /* see below */
    int res4;                   /* reserved; '0' for now */
    int tag_size;               /* see below */
} dr_leader;

/*
 * A linked list of these structs constitutes the dr directory. There is
 * a correspondence between the dr_entry structs and the
 * uda (user data area) structs as described below.
 * Corresponding structs are matched by the "tag" member in dr_entry
 * and the "field_tag" member in uda_entry. The
 * directory region is terminated with a FIELD_TERM.
 *
 * 'length' is the total length of the uda field (see below) including
 * terminator characters. 'position' is the offset of the uda field from
 * the start of the uda area.
 *
 * This is exactly the same as a ddr_entry struct. I may combine them
 * some day. I just didn't realize that they were the same until I was
 * done with the parser. Keeping them separate makes it easier to
 * keep the names of things separate anyway.
 */
typedef struct dr_entry
{
    char *tag;                  /* length gotten from tag_size in leader */
    int length;                 /* length of "length" is gotten from
                                 * length_size in leader */
    int position;               /* length is gotten from position_size
                                 * in leader */
    struct dr_entry *next;
} dr_entry;


/*
 * This is the user data area (uda) of the dr.
 * The length of the uda list will be the same as the length of the
 * dr_entry list above. Each entry in the uda is also terminated
 * with a FIELD_TERM (ctrl-^).
 *
 * The only thing that might have to be
 * handled specially in here is if arrays are defined by an array
 * descriptor in the uda; a strange beast that is just like an array
 * descriptor as may be found in the dda label field except that it
 * has its fields separated by UNIT_TERM (ctrl-_) instead of commas.
 */
typedef struct uda_entry
{
    char *field_tag;            /* length is up to field terminator */
    char *vec_tag;              /* length is up to next vector item */
    char type;                  /* A, I, R, S, C, B, or X */
    union {
        char *cp;               /* CHAR (actually a string) */
        int i;                  /* INT */
        double d;               /* FLOAT, EXP_FLOAT */
        int *bf;                /* BITFIELD, CHAR_BIT_STRING */
        void *ignore;           /* IGNORE */
    } data;                     /* user data */
    struct uda_entry *next;
} uda_entry;

extern char *malloc();          /* Have to put this somewhere. */
extern format *formatlist;      /* global pointer where format list goes */
extern format *repeatlist;      /* global pointer for format list repeat */

/*
 * All the public functions
 */
extern struct ddr_leader *parse_ddr_leader();
extern struct ddr_entry *parse_ddr_directory();
extern struct dda_entry *parse_dda();
extern struct dda_entry *parse_ddr();
extern struct dr_leader *parse_dr_leader();
extern struct dr_entry *parse_dr_directory();
extern struct dr_entry *parse_dr();

#endif /* ISO8211_H */